Machine Learning - Vectorization
This article explains what vectorization is in machine learning and how it applies to linear regression.
Parameters and Features in Linear Regression #
In linear regression with multiple variables, we deal with vectors of parameters and features. Consider the following vectors:
- Parameters vector \( \vec{w} \) represents the weights: \( \vec{w} = [w_1 \ w_2 \ w_3]\)
- Features vector \( \vec{x} \) represents the input features: \( \vec{x} = [x_1 \ x_2 \ x_3]\)
Representation in Code #
In programming, especially with Python and libraries like NumPy, arrays are zero-indexed, so \(w_1\) corresponds to w[0]. For the vectors above, the code representation is as follows:
import numpy as np
w = np.array([1.0, 2.5, -3.3]) # Weights
b = 4 # Bias
x = np.array([10, 20, 30]) # Features
Linear Regression Model Without Vectorization #
The prediction \( f_{\vec{w},b}(\vec{x}) \) can be calculated without vectorization, using a straightforward approach:
f = (w[0] * x[0] +
     w[1] * x[1] +
     w[2] * x[2] + b)
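Written out, this is simply the linear regression model evaluated term by term:
\[ f_{\vec{w},b}(\vec{x}) = w_1 x_1 + w_2 x_2 + w_3 x_3 + b \]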
Or, for a more generalized form considering any number of features \(n\):
f = 0
n = len(w)  # Assuming w and x have the same length
for j in range(0, n):
    f += w[j] * x[j]
f += b
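In summation notation, this loop computes:
\[ f_{\vec{w},b}(\vec{x}) = \sum_{j=1}^{n} w_j x_j + b \]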
Linear Regression Model With Vectorization #
Vectorization allows for a more efficient calculation by leveraging NumPy’s dot product function:
f = np.dot(w,x) + b
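As a quick sanity check, here is a minimal, self-contained sketch (using the example values from earlier) showing that the loop-based and vectorized computations produce the same prediction:
import numpy as np

w = np.array([1.0, 2.5, -3.3])  # Weights
b = 4                           # Bias
x = np.array([10, 20, 30])      # Features

# Non-vectorized: accumulate the products one feature at a time
f_loop = 0
for j in range(len(w)):
    f_loop += w[j] * x[j]
f_loop += b

# Vectorized: a single dot product does the same work
f_vec = np.dot(w, x) + b

print(f_loop, f_vec)  # Both print -35.0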
Updating Parameters with Gradient Descent #
Gradient descent is a method used to update the parameters \( \vec{w} \) in order to minimize the cost function. Consider a gradient vector \( \vec{d} \) and a learning rate of \(0.1\).
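In other words, every parameter is nudged against its gradient, scaled by the learning rate:
\[ w_j = w_j - 0.1 \, d_j \quad \text{for all } j \]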
Without Vectorization
The update rule for each parameter \(w_j\) without vectorization would look like:
for j in range(0, 16):  # Assuming there are 16 parameters and gradients
    w[j] = w[j] - 0.1 * d[j]
With Vectorization
Vectorization simplifies the update process significantly:
w = w - 0.1 * d
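As a minimal sketch (the parameter and gradient values below are made up purely for illustration), the vectorized update adjusts all 16 parameters in a single NumPy operation:
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(16)  # 16 example parameters
d = rng.standard_normal(16)  # 16 example gradients

# Element-wise update: equivalent to looping over every index j
w = w - 0.1 * d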